Needed for rendering surface images in RMarkdown.

library(rgl)
rgl::setupKnitr()

# Sometimes the first OpenGL window does not render properly.
rgl::open3d(); rgl::close3d()
## glX 
##   1

This demo will introduce you to the BayesfMRI package and its main function, BayesGLM. This function performs spatial Bayesian modeling on task fMRI data.

The spatial modeling used in BayesfMRI is grayordinate-based, meaning that the spatial priors underlying the models are surface-based and subcortical parcel constrained. This is designed to respect neuroanatomy and is analogous to smoothing along the cortical surface or within specified subcortical volumetric parcels, and avoid mixing of distinct signals.

The BayesfMRI package is designed with CIFTI data in mind, since CIFTI data is in grayordinates format. It is also compatible with GIFTI (surface only) and NIFTI (volumetric) format data, but those require putting the data in the correct format first. For NIFTI data, the package is designed to model smaller subcortical structures, rather than whole-brain volumetric data.

Prerequisites

First, make sure you have updated R (this code was built on R 4.4.0) and RStudio. We will also be using a number of different R packages. To install packages from CRAN, simply use install.packages(). To install packages from Github, you will need to have the devtools package installed, and use devtools::install_github(). To install from a specific Github branch, use the ref argument in install_github().

  1. Since the BayesfMRI package is designed primarily to work with CIFTI-format data, it requires the ciftiTools package, which relies on the Connectome Workbench. You will need to install the Connectome Workbench on your local machine, and note the file path where it is located. You can then install ciftiTools from CRAN or Github.

  2. The spatial Bayesian modeling underlying BayesGLM relies on the INLA package. This package must be installed following the instructions here: https://www.r-inla.org/download-install. Because INLA has many dependencies, most of which we won’t need, you can try running the install with dep=FALSE. In that case, you may need to install two required dependencies: sp and Matrix. If you are working on Linux, you may need to install the appropriate binaries via inla.binary.install().

  3. Once you have installed ciftiTools and INLA, go ahead and install BayesfMRI. This demo was built on version 8.0, which can be installed from Github – see the commented-out code below.

# install.packages('ciftiTools')
# install.packages("INLA",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/stable"), dep=FALSE)
# devtools::install_github('mandymejia/BayesfMRI', ref=`8.0)
library(ciftiTools)
ciftiTools.setOption('wb_path','/Applications') #where your Connectome Workbench installation is located
## Using this Workbench path: '/Applications/workbench/bin_macosx64/wb_command'.
library(INLA) #load INLA to ensure it was installed properly
library(BayesfMRI)

Locating the Data

We will be analyzing the HCP emotion task for one subject. In this task, subjects were shown fearful and neutral visual stimuli. We will be modeling both cortical surfaces, as well as two subcortical structures: the left and right amygdala, which is associated with fear. For the cortial surface, for computational efficiency we previously resampled to approximately 10,000 vertices per hemisphere from the original ~32,000 resolution, using ciftiTools::resample_cifti().

For any task fMRI analysis, we need at a minimum:

  1. The task fMRI BOLD data, preferably in CIFTI format

  2. A set of onsets and durations for each task/stimulus, OR a pre-computed task design matrix

  3. A set of nuisance regressors, e.g. motion realignment parameters

We will start by reading in each of these.

getwd()
## [1] "/Users/afmejia/Documents/Github/BayesfMRI/vignettes"
dir_data <- '../vignette_data/session1'
setwd(dir_data)
fname_BOLD <- 'BOLD_10k.dtseries.nii'
fname_motion <- 'Movement_Regressors.txt'
fname_events <- c('EVs/fear.txt', 'EVs/neut.txt')

if(!file.exists(fname_BOLD)) stop('BOLD data not found, check your file paths')
if(!all(file.exists(fname_events))) stop('Task event data not found, check your file paths')
if(!file.exists(fname_motion)) stop('Nuisance regressors not found, check your file paths')

First, we read in the BOLD data. We can glimpse its structure, and grab the TR and length of the time series.

setwd(dir_data)
(BOLD_xii <- read_cifti(fname_BOLD, brainstructures = "all")) #use resamp_res to resample to a lower cortical resolution
## ====CIFTI METADATA===================
## Intent:           3002 (dtseries)
## - time step       0.72 (seconds)
## - time start      0
## Measurements:     176 columns
## 
## ====BRAIN STRUCTURES=================
## - left cortex     9394 data vertices
##                   848 medial wall vertices (10242 total)
## 
## - right cortex    9398 data vertices
##                   844 medial wall vertices (10242 total)
## 
## - subcortex       31870 data voxels
##                   subcortical structures and number of voxels in each:
##                     Cortex-L (0), Cortex-R (0),
##                     Accumbens-L (135), Accumbens-R (140),
##                     Amygdala-L (315), Amygdala-R (332),
##                     Brain Stem (3472),
##                     Caudate-L (728), Caudate-R (755),
##                     Cerebellum-L (8709), Cerebellum-R (9144),
##                     Diencephalon-L (706), Diencephalon-R (712),
##                     Hippocampus-L (764), Hippocampus-R (795),
##                     Pallidum-L (297), Pallidum-R (260),
##                     Putamen-L (1060), Putamen-R (1010),
##                     Thalamus-L (1288), Thalamus-R (1248),
##                     Other (0).
(TR <- BOLD_xii$meta$cifti$time_step)
## [1] 0.72
(nT <- ncol(BOLD_xii))
## [1] 176

Now read in and glimpse the motion regressors and task timing information.

setwd(dir_data)
motion <- as.matrix(read.table(fname_motion, header=FALSE))
head(motion)
##             V1       V2       V3        V4        V5       V6        V7
## [1,]  0.003379 0.077940 0.062644 -0.034607  0.000000 0.053629  0.000000
## [2,]  0.040860 0.113567 0.068549 -0.003839 -0.032200 0.042170  0.037481
## [3,]  0.003503 0.100619 0.114477  0.000000 -0.030596 0.051280 -0.037357
## [4,] -0.054154 0.075328 0.091263 -0.013923  0.000000 0.040966 -0.057657
## [5,] -0.018085 0.080291 0.076363  0.002521  0.000000 0.000000  0.036069
## [6,]  0.035357 0.102679 0.055283  0.000000  0.000000 0.033289  0.053442
##             V8        V9       V10       V11       V12
## [1,]  0.000000  0.000000  0.000000  0.000000  0.000000
## [2,]  0.035627  0.005905  0.030768 -0.032200 -0.011459
## [3,] -0.012948  0.045928  0.003839  0.001604  0.009110
## [4,] -0.025291 -0.023214 -0.013923  0.030596 -0.010314
## [5,]  0.004963 -0.014900  0.016444  0.000000 -0.040966
## [6,]  0.022388 -0.021080 -0.002521  0.000000  0.033289
events <- lapply(fname_events, read.table, header=FALSE)
names(events) <- c('fear', 'neut')
events
## $fear
##        V1 V2 V3
## 1  32.053 18  1
## 2  74.196 18  1
## 3 116.338 18  1
## 
## $neut
##       V1 V2 V3
## 1 10.982 18  1
## 2 53.125 18  1
## 3 95.267 18  1

Construct the Design Matrix

First, we construct the task design matrix using the make_design function (see help(make_design)). This function checks for collinearity between the predictors by returning and printing the maximum pairwise correlation and the variance inflation factor (VIF). If the VIF exceeds five, it is important to examine the design matrix for sources of multicollinearity.

Notice the format of the EVs argument. The events object above is an example of correct formatting: a list, with each list element representing one task. The names of the list are the task names, and each list element is a matrix. The first column is the stimulus start time, and the second columns is the duration. In the HCP data there is a third column, but this will be ignored.

design <- make_design(EVs=events, nTime = nT, TR = TR)
## Maximum corr.:  -0.782 
## Maximum VIF:    2.57
plot(design, colors = c('red','black'))

Including HRF Derivatives

We can optionally set dHRF = 1 to include the temporal derivative of the HRF or dHRF = 2 to include both temporal and dispersion derivatives. This allows for small shifts in the timing of the HRF.

design_dHRFs <- make_design(EVs=events, nTime = nT, TR = TR, dHRF = 1)
## Maximum corr.:  0.061 
## Maximum VIF:    2.59
plot(design_dHRFs, colors = c('red','black','pink','grey'))

In addition to the design matrix itself, this function returns additonal information, including the hemodynamic response function (HRF) convolved with the stimulus response function,

names(design)
## [1] "design"      "field_names" "dHRF"        "HRF_info"    "diagnostics"
head(design$design) #the actual design matrix
##               fear         neut
## [1,] -2.530754e-14 2.715097e-14
## [2,] -2.536246e-14 2.751692e-14
## [3,] -2.560462e-14 2.743907e-14
## [4,] -2.560182e-14 2.766485e-14
## [5,] -2.627375e-14 2.798314e-14
## [6,] -2.600294e-14 2.839620e-14

Constructing Contrasts

In the design matrix above, there is one regressor for each task. If we are interested more in the contrast between tasks, we can modify the design matrix. If \(x_1\) and \(x_2\) are two separate tasks that we wish to contrast, then we can construct two columns: \(w_1 = x_1 + x_2\) and \(w_2 = \frac{1}{2}(x_1 - x_2)\) (or \(\frac{1}{2}(x_2 - x_1)\), depending on the direction of the contrast of interest). Then our modified design matrix consists of two columns, \(w_1\) and \(w_2\). The coefficient associated with \(w_1\) is the average activation across both tasks, and the coefficient associated with \(w_2\) is the difference in activation with task \(x_1\) versus task \(x_2\).

For example, we may be interested in the contrast “fear minus neutral”. In that case, we can modify our design matrix as follows:

design0 <- design$design #the one returned by make_design
head(design0)
##               fear         neut
## [1,] -2.530754e-14 2.715097e-14
## [2,] -2.536246e-14 2.751692e-14
## [3,] -2.560462e-14 2.743907e-14
## [4,] -2.560182e-14 2.766485e-14
## [5,] -2.627375e-14 2.798314e-14
## [6,] -2.600294e-14 2.839620e-14
avg_fear_neut <- design0[,1] + design0[,2]
fear_vs_neut <- (design0[,1] - design0[,2])/2
design1 <- cbind(avg_fear_neut, fear_vs_neut)
plot_design(design1, colors = c('red','black'))

Call BayesGLM for model fitting

Now we are ready to call the BayesGLM function! This function will fit a spatial Bayesian model to our task fMRI data. It will also fit a classical, “massive univariate” model, to serve as a benchmark and comparison. To skip the Bayesian modeling and only fit the classical GLM, set Bayes = FALSE.

The BayesGLM function provides a few additional options:

Temporal Filtering: To incorporate high-pass filtering, set the hpf argument. For example, setting hpf=0.01 will achieve a high-pass filter at 0.01 Hz. Filtering is performed via discrete cosine transform (DCT) regressors, which are included in the model. This is a simultaneous regression approach, which avoids pitfalls associated with modular preprocessing (Lindquist et al., 2019).

Prewhitening: A spatially varying prewhitening technique is implemented in BayesGLM. This approach accounts for spatial variability of residual autocorrelation, and avoids differences in false positive control and power across the brain (Parlak et al., 2023). To perform prewhitening, set ar_order to any positive integer, indicating the AR model order to use. To spatially smooth the AR coefficients, set ar_smooth to a positive value indicating the FWHM of the smoothing kernel in mm. Set aic = TRUE to optimize the AR model order at each voxel/vertex, up to the value ar_order.

Signal Dropout: To exclude voxels/vertices with poor signal, set meanTol and/or varTol. Brain locations with mean/variance below these values will be excluded.

Scrubbing: (Coming soon!) To exclude certain time points due to head motion or other reasons (e.g. drowsiness), set scrub to indicate any time points that should be excluded from analysis. Time points will only be removed after the DCT bases are generated and the prewhitening parameters are estimated, since those steps assume the original temporal structure of the data.

Masking: To exclude certain regions or focus on a specific anatomical area of the cortex or subcortex, you can simply mask out the voxels or vertices to be excluded. By setting the values to zero or NA, they will be ignored by BayesGLM. However, care should be taken to make regions large enough to facilitate spatial model fitting. Regions that are very small may make it difficult for the model to estimate the spatial correlation structure. Regions should be large enough to emcompass the areas of activation within the region, plus background areas of low activation surrounding them.

Parallelization: If you have multiple cores available, you can set n_cores to use parallelization for maximal computational efficiency. By default, up to 4 cores are used.

system.time(
  bglm <- BayesGLM(BOLD=BOLD_xii,
        design=design$design,
        brainstructures="sub",
        subROI=c('Amygdala-L','Amygdala-R','Hippocampus-L','Hippocampus-R'),
        TR=TR,
        nuisance=motion,
        scale_BOLD='mean',
        hpf=.01,
        nbhd_order=1,
        ar_order=3,
        ar_smooth=0,
        Bayes=TRUE,
        verbose=0 ,
        meanTol=1))
##    user  system elapsed 
##  60.230  14.704  44.159
slices <- 24:32 
plot(bglm, Bayes=TRUE, idx="fear", slices = slices)
plot(bglm, Bayes=FALSE, idx="fear", slices = slices)

act <- activations(bglm, Bayes = TRUE, gamma = 0, alpha = 0.05) 
## Identifying Bayesian GLM activations in subcort
act0 <- activations(bglm, Bayes = FALSE, gamma = 0, alpha = 0.05, correction = "FWER")
act00 <- activations(bglm, Bayes = FALSE, gamma = 0, alpha = 0.05, correction = "FDR")
plot(act, idx="fear", title = "Bayesian GLM", slices = slices)
plot(act0, idx="fear", title = "Classical GLM (FWER)", slices = slices)
plot(act00, idx="fear", title = "Classical GLM (FDR)", slices = slices)

system.time(
  bglm1 <- BayesGLM(BOLD=BOLD_xii,
        design=design$design,
        brainstructures="all",
        subROI=c('Amygdala-L','Amygdala-R','Hippocampus-L','Hippocampus-R'),
        TR=TR,
        nuisance=motion,
        scale_BOLD='mean',
        hpf=.01,
        nbhd_order=1,
        ar_order=3,
        ar_smooth=0,
        Bayes=TRUE,
        verbose=0 ,
        meanTol=1))
## Since no surfaces provided or present, I will use the fs_LR inflated surfaces for all modeling.
## Since no surfaces provided or present, I will use the fs_LR inflated surfaces for all modeling.
##    user  system elapsed 
## 420.077  70.379 287.376

Visualize the cortical surface estimates

plot(bglm1, Bayes=TRUE, idx="fear", title = "Bayesian GLM", what = 'surface') 

plot(bglm1, Bayes=FALSE, idx="fear", title = "Classical GLM", what = 'surface')

Visualize the subcortical estimates

plot(bglm1, Bayes=TRUE, idx="fear", title = "Bayesian GLM", what = 'volume', slices = slices) 

plot(bglm1, Bayes=FALSE, idx="fear", title = "Classical GLM", what = 'volume', slices = slices)

act <- activations(bglm1, Bayes = TRUE, gamma = 0, alpha = 0.05, verbose = 0) 
act0 <- activations(bglm1, Bayes = FALSE, gamma = 0, alpha = 0.05, correction = "FWER", verbose = 0)
act00 <- activations(bglm1, Bayes = FALSE, gamma = 0, alpha = 0.05, correction = "FDR", verbose = 0)
plot(act, idx="fear", title = "Bayesian GLM", slices = slices) 
plot(act0, idx="fear", title = "Classical GLM (FWER)", slices = slices)
plot(act00, idx="fear", title = "Classical GLM (FDR)", slices = slices)

Section on BayesGLM2 and Prevalence Maps